This project aims to provide insight into the conditions of the property market in Jabodetabek. The data used are the results of a quick sample of ± 10,000 houses offered on one of the marketplace sites in Indonesia in September 2020. The values shown do not necessarily represent the actual prices. This project is modified from Geospatial Analysis in R workshop by Algoritma Data Science School.
The final result of this project is a simple Jabodetabek housing dashboard which can be seen here. It looks like this:
Figure 1.1: Snapshot of the final dashboard
In this Library and Setup section you’ll see some code to initialize our workspace, and the packages we’ll be using for this project.
Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library. R comes with a standard set of packages. Others are available for download and installation. Once installed, they have to be loaded into the session to be used.
You will need to use install.packages() to install any packages that are not yet downloaded onto your machine. To install packages, type the command below on your console then press ENTER.
## DO NOT RUN CHUNK
# packages <- c("rgdal","sf","tidyverse","glue", "plotly", "maps","leaflet","leaflet.extras", "tmap", "flexdashboard","DT")
#
# install.packages(packages)Then you need to load the package into your workspace using the library() function. Special for this course, the rmarkdown packages do not need to be called using library().
Before starting the analysis, first, we have to read the data:
#> id kota kecamatan harga KT KM m2
#> 1 1 Jakarta Pusat Gambir 3900000000 3 2 169
#> 2 2 Jakarta Pusat Kemayoran 1450000000 3 2 160
#> 3 3 Jakarta Pusat Gambir 11100000000 10 8 720
#> 4 4 Jakarta Pusat Kemayoran 485000000 2 1 35
#> 5 5 Jakarta Utara Tanjung Priok 5600000000 6 5 240
#> 6 6 Jakarta Barat Tamansari 3000000000 10 5 180
We want to compare the house pricing for each sub-district (Kecamatan) level. It’s better use the price per square meter since the size of the listed properties may vary:
# Make a dataframe that shows the house pricing for each sub-district (`Kecamatan`) level
df_agg <- df %>%
mutate(
harga_m2 = harga / m2
) %>%
group_by(kota, kecamatan) %>%
summarise(harga_m2 = median(harga_m2),
total_listings= n()) %>%
ungroup()
head(df_agg)#> # A tibble: 6 x 4
#> kota kecamatan harga_m2 total_listings
#> <chr> <chr> <dbl> <int>
#> 1 Bekasi Tarumajaya 12281250 463
#> 2 Depok Beji 11746032. 84
#> 3 Depok Cimanggis 11679365. 110
#> 4 Depok Cinere 12761905. 88
#> 5 Jakarta Barat Cengkareng 13040000 455
#> 6 Jakarta Barat Grogolpetamburan 15032680. 223
In order to make the data can be displayed in map, we need spatial data. This time I used shapefile of Indonesia provided by GADM. Indonesia’s spatial vector is provided up to 4 levels of granularity. In this case, we use a level of spatial vector that contains Kecamatan.
Let’s read the spatial data, then display it:
#> Reading layer `idn' from data source `D:\Learn Data\AA Data Competition\Portofolio R Geospatial\rgeo-intro-master\shp' using driver `ESRI Shapefile'
#> Simple feature collection with 6694 features and 16 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: 95.0097 ymin: -11.00762 xmax: 141.0194 ymax: 6.076941
#> geographic CRS: WGS 84
Now, we have to join the to the spatial data. This process aims to project our housing data with the geographical information.
#> GID_0 NAME_0 GID_1 NAME_1 NL_NAME_1 GID_2 NAME_2 NL_NAME_2
#> 1 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 2 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 3 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 4 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 5 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> 6 IDN Indonesia IDN.1_1 Aceh <NA> IDN.1.2_1 Aceh Barat <NA>
#> GID_3 NAME_3 VARNAME_3 NL_NAME_3 TYPE_3 ENGTYPE_3
#> 1 IDN.1.2.1_1 Arongan Lambalek <NA> <NA> Kecamatan Sub-district
#> 2 IDN.1.2.2_1 Bubon <NA> <NA> Kecamatan Sub-district
#> 3 IDN.1.2.3_1 Johan Pahlawan <NA> <NA> Kecamatan Sub-district
#> 4 IDN.1.2.4_1 Kaway Xvi <NA> <NA> Kecamatan Sub-district
#> 5 IDN.1.2.5_1 Meureubo <NA> <NA> Kecamatan Sub-district
#> 6 IDN.1.2.6_1 Pantai Ceuremen <NA> <NA> Kecamatan Sub-district
#> CC_3 HASC_3 geometry
#> 1 1107062 <NA> MULTIPOLYGON (((95.97953 4....
#> 2 1107061 <NA> MULTIPOLYGON (((96.16601 4....
#> 3 1107050 <NA> MULTIPOLYGON (((96.13205 4....
#> 4 1107080 <NA> MULTIPOLYGON (((96.16397 4....
#> 5 1107081 <NA> MULTIPOLYGON (((96.25119 4....
#> 6 1107082 <NA> MULTIPOLYGON (((96.18817 4....
We can do this by joining the information of the city (kota) & sub-district (kecamatan) to NAME_2 & NAME_3 variables respectively in idn. By doing left join with df_agg as left_table the result table will only include the Jabodetabek area (same as df_agg)
# Join data
df_agg <- df_agg %>%
left_join(idn, by = c("kota" = "NAME_2", "kecamatan" = "NAME_3"))
head(df_agg)#> # A tibble: 6 x 19
#> kota kecamatan harga_m2 total_listings GID_0 NAME_0 GID_1 NAME_1 NL_NAME_1
#> <chr> <chr> <dbl> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 Beka~ Tarumaja~ 1.23e7 463 IDN Indon~ IDN.~ Jawa ~ <NA>
#> 2 Depok Beji 1.17e7 84 IDN Indon~ IDN.~ Jawa ~ <NA>
#> 3 Depok Cimanggis 1.17e7 110 IDN Indon~ IDN.~ Jawa ~ <NA>
#> 4 Depok Cinere 1.28e7 88 IDN Indon~ IDN.~ Jawa ~ <NA>
#> 5 Jaka~ Cengkare~ 1.30e7 455 IDN Indon~ IDN.~ Jakar~ <NA>
#> 6 Jaka~ Grogolpe~ 1.50e7 223 IDN Indon~ IDN.~ Jakar~ <NA>
#> # ... with 10 more variables: GID_2 <chr>, NL_NAME_2 <chr>, GID_3 <chr>,
#> # VARNAME_3 <chr>, NL_NAME_3 <chr>, TYPE_3 <chr>, ENGTYPE_3 <chr>,
#> # CC_3 <chr>, HASC_3 <chr>, geometry <MULTIPOLYGON [°]>
Now that our df_agg data have the geographic attributes attached, we can turn it into an sf object using st_as_sf() function:
#> Geometry set for 59 features
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: 106.6325 ymin: -6.39886 xmax: 107.0317 ymax: -6.0409
#> geographic CRS: WGS 84
#> First 5 geometries:
Now, let’s plot the data to the map using leaflet packages.
# Leaflet
leaflet(df_agg) %>% # create map widget
addProviderTiles(providers$CartoDB.Positron) %>% # add basemap
addPolygons() # add polygons from `sf` data %Now color the polygons based on the house pricing for each kecamatan. First, we need to create a color palette to represent the data. In the chunk below, we create an object called pal which stores the colors generated from df_agg$harga_m2 values. The value passed to fill in palette is provided by ColorBrewer2 palettes.
pal <- colorNumeric(palette = "YlOrRd", domain = df_agg$harga_m2)
leaflet(df_agg) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white"
)To make our map more interactive we can add feature like highlight and labels, so as the mouse passes over them, we can see the hover and the details about the sub district (kecamatan).
# Make labels variable for parameter in leaflet
labels <- glue::glue("
<b>{df_agg$kecamatan}, {df_agg$kota}</b>:<br> {round(df_agg$harga_m2/1e6, 2)} jt/m2 <br> Total house listed: {df_agg$total_listings}"
) %>% lapply(htmltools::HTML)We can add label and highlight argument inside the addPolygons() function.
# Add highlight and label to make map more interactive
leaflet(df_agg) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(
label = labels,
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white",
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
opacity = 0.8
)
)As the final step, we might also need to add a legend to give information about the colors and intervals. To add the legend, we only a layer of addLegend() function:
leaflet(df_agg) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(
label = labels,
fillColor = ~pal(harga_m2),
fillOpacity = .8,
weight = 2,
color = "white",
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
opacity = 0.8
)
) %>%
addLegend(
pal = pal,
values = ~harga_m2,
opacity = 1,
title = "Harga/m2",
position = "bottomright"
)To separate DKI Jakarta’s area with the other area, I create a border.
Now add the border to the whole code. The full code will look like this:
leaflet(df_agg) %>%
addProviderTiles(providers$CartoDB.Positron) %>% # using `addProviderTiles()` instead of `addTiles()`
addPolygons(
label = labels,
labelOptions = labelOptions(
style = list(
"font-size"="13px",
"background-color"="black",
"color"="white"
)
),
weight = 2,
color = "white",
fillOpacity = .8,
fillColor = ~pal(harga_m2),
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
sendToBack = TRUE,
opacity = 0.8)
) %>%
addPolylines(
data = border,
color = "darkred",
opacity = .8,
weight = 2.5
) %>%
addLegend(
pal = pal,
values = ~harga_m2,
opacity = 1,
title = "Harga/m2",
position = "bottomright"
) Now, I create a geographical density map to show the density of housing in Jabodetabek. Before doing that, we have to read the data first:
#> perumahan kota latitude longitude
#> 1 Bintaro Jaya Sektor 2 Jakarta Selatan -6.278108 106.7522
#> 2 Buaran Regency Jakarta Timur -6.232117 106.9251
#> 3 Casa Permata Hijau Jakarta Selatan -6.221427 106.7832
#> 4 Cempaka Residence Jakarta Selatan -6.291828 106.7779
#> 5 Green Lake City Jakarta Barat -6.177477 106.7113
#> 6 Green Lontar Jakarta Selatan -6.364691 106.7983
The geographical density map is presented using a heatmap:
An effective way to communicate data visualization is by creating a dashboard. Dashboard helps to communicate information intuitively and essential to support data-driven decision making. The dashboard made using Flexdashboard package
Here you can access simple Jabodetabek housing dashboard I created: Jabodetabek Property Analysis Dashboard
I add a Average Price per Kecamatan Table to the dashboard. The complete .Rmd for the dashboard look like this:
---
title: "Jabodetabek Property Analysis Dashboard"
output:
flexdashboard::flex_dashboard:
theme: readable
---
```{r setup, include=FALSE}
library(flexdashboard)
library(tidyverse)
library(glue)
library(scales)
library(sf)
library(plotly)
library(leaflet)
library(leaflet.extras)
# read data
idn <- st_read(dsn = "shp", layer = "idn")
df <- read.csv("data/listings.csv")
perum <- read.csv('data/perumahan.csv')
# aggregate
df_agg <- df %>%
mutate(
harga_m2 = harga / m2
) %>%
group_by(kota, kecamatan) %>%
summarise(harga_m2 = median(harga_m2),
total_listings= n()) %>%
ungroup() %>%
left_join(idn, by = c("kota" = "NAME_2", "kecamatan" = "NAME_3")) %>%
st_as_sf()
```
Column {.tabset}
-------------------------------------
### Distribution of House Price/m2 for each Kecamatan
```{r}
pal <- colorNumeric(palette = "YlOrRd", domain = df_agg$harga_m2)
labels <- glue::glue("
<b>{df_agg$kecamatan}, {df_agg$kota}</b>:<br> {round(df_agg$harga_m2/1e6, 2)} jt/m2 <br> Total house listed: {df_agg$total_listings}") %>% lapply(htmltools::HTML)
border <- df_agg %>%
filter(NAME_1 == "Jakarta Raya") %>%
group_by(NAME_1) %>%
summarise()
leaflet(df_agg) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(
label = labels,
labelOptions = labelOptions(
style = list(
"font-size"="13px",
"background-color"="black",
"color"="white"
)
),
weight = 2,
color = "white",
fillOpacity = .8,
fillColor = ~pal(harga_m2),
highlight = highlightOptions(
weight = 5,
color = "black",
bringToFront = TRUE,
sendToBack = TRUE,
opacity = 0.8)
) %>%
addPolylines(
data = border,
color = "darkred",
opacity = .8,
weight = 3
) %>%
addLegend(
pal = pal,
values = ~harga_m2,
opacity = 1,
title = "Harga/m2",
position = "bottomright"
) %>%
setView(lat = -6.225337, lng = 106.841086, zoom=11.2)
```
### Housing Heatmap
```{r}
library(leaflet.extras)
leaflet(perum) %>%
addProviderTiles(providers$CartoDB.Voyager) %>%
addCircles(
label = ~perumahan,
color = "red"
) %>%
addHeatmap(
radius = 10
)%>%
setView(lat = -6.225337, lng = 106.841086, zoom=11.2)
```
Column
-------------------------------------
### Average Price per Kecamatan Table
```{r}
library(DT)
data <- df_agg %>%
as.data.frame() %>%
arrange(desc(harga_m2)) %>%
select(kota, kecamatan, harga_m2) %>%
mutate(harga_m2 = number(harga_m2, big.mark = ",")) %>%
rename(
Kota = kota,
Kecamatan = kecamatan,
`Price/m2` = harga_m2
)
DT::datatable(
data,
extensions = "Buttons",
options = list(
pageLength = 25,
dom = 'Bfrtip',
buttons = c('csv','excel','pdf')
)
)
```